Overview

Dataset statistics

Number of variables: 13
Number of observations: 178
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 18.2 KiB
Average record size in memory: 104.7 B
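The derived figures in this block follow from the raw counts by simple arithmetic. The sketch below recomputes them from the values listed above; it assumes that the average record size is the total memory divided by the number of observations, which matches the reported numbers but is not confirmed by the report itself. All variable names are illustrative.

```python
# Values copied from the dataset statistics above.
n_variables = 13
n_observations = 178
missing_cells = 0
total_size_kib = 18.2  # 1 KiB = 1024 bytes

# Every variable contributes one cell per observation.
total_cells = n_variables * n_observations

# Share of cells that are missing, in percent.
missing_pct = 100 * missing_cells / total_cells

# Assumption: average record size = total memory / number of observations.
avg_record_bytes = total_size_kib * 1024 / n_observations

print(total_cells, missing_pct, round(avg_record_bytes, 1))
```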

Variable types

NUM: 13

Reproduction

Analysis started: 2020-06-03 16:20:24.812668
Analysis finished: 2020-06-03 16:20:44.689113
Duration: 19.88 seconds
Version: pandas-profiling v2.8.0
Command line: pandas_profiling --config_file config.yaml [YOUR_FILE.csv]
Download configuration: config.yaml

Variables

alcohol
Real number (ℝ≥0)

Distinct count: 126
Unique (%): 70.8%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 13.00061797752809
Minimum: 11.03
Maximum: 14.83
Zeros: 0
Zeros (%): 0.0%
Memory size: 1.5 KiB

Quantile statistics

Minimum: 11.03
5-th percentile: 11.6585
Q1: 12.3625
median: 13.05
Q3: 13.6775
95-th percentile: 14.2215
Maximum: 14.83
Range: 3.8
Interquartile range (IQR): 1.315

Descriptive statistics

Standard deviation: 0.811826538
Coefficient of variation (CV): 0.06244522679
Kurtosis: -0.8524995685
Mean: 13.00061798
Median Absolute Deviation (MAD): 0.68
Skewness: -0.05148233108
Sum: 2314.11
Variance: 0.6590623278
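
Several of the statistics above are derived from others in the same block. As a rough cross-check, the sketch below recomputes the range, IQR, coefficient of variation and variance for alcohol from the reported minimum, maximum, quartiles, mean and standard deviation. The variable names are illustrative, and the same relationships apply to every variable in this report.

```python
# Values copied from the alcohol statistics above.
minimum, maximum = 11.03, 14.83
q1, q3 = 12.3625, 13.6775
mean, std = 13.00061798, 0.811826538

value_range = maximum - minimum  # Range = Maximum - Minimum
iqr = q3 - q1                    # Interquartile range = Q3 - Q1
cv = std / mean                  # Coefficient of variation = std / mean
variance = std ** 2              # Variance = squared standard deviation

print(round(value_range, 2), round(iqr, 3), round(cv, 6), round(variance, 6))
```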
Histogram with fixed size bins (bins=10)

Value | Count | Frequency (%)
12.37 | 6 | 3.4%
13.05 | 6 | 3.4%
12.08 | 5 | 2.8%
12.29 | 4 | 2.2%
12 | 3 | 1.7%
12.25 | 3 | 1.7%
12.42 | 3 | 1.7%
12.93 | 2 | 1.1%
12.6 | 2 | 1.1%
12.85 | 2 | 1.1%
14.1 | 2 | 1.1%
13.16 | 2 | 1.1%
14.06 | 2 | 1.1%
13.88 | 2 | 1.1%
13.56 | 2 | 1.1%
14.38 | 2 | 1.1%
13.86 | 2 | 1.1%
13.17 | 2 | 1.1%
12.77 | 2 | 1.1%
13.58 | 2 | 1.1%
13.49 | 2 | 1.1%
12.72 | 2 | 1.1%
12.51 | 2 | 1.1%
12.7 | 2 | 1.1%
13.71 | 2 | 1.1%
Other values (101) | 112 | 62.9%

Value | Count | Frequency (%)
11.03 | 1 | 0.6%
11.41 | 1 | 0.6%
11.45 | 1 | 0.6%
11.46 | 1 | 0.6%
11.56 | 1 | 0.6%
11.61 | 1 | 0.6%
11.62 | 1 | 0.6%
11.64 | 1 | 0.6%
11.65 | 1 | 0.6%
11.66 | 1 | 0.6%

Value | Count | Frequency (%)
14.83 | 1 | 0.6%
14.75 | 1 | 0.6%
14.39 | 1 | 0.6%
14.38 | 2 | 1.1%
14.37 | 1 | 0.6%
14.34 | 1 | 0.6%
14.3 | 1 | 0.6%
14.23 | 1 | 0.6%
14.22 | 2 | 1.1%
14.21 | 1 | 0.6%
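
The Frequency (%) column in the alcohol value counts above is each count divided by the 178 observations, and Unique (%) is the distinct count divided by the same total. The sketch below reproduces a few of the listed percentages; the dictionary holds only a handful of rows from the table, and the names are illustrative.

```python
n_observations = 178
distinct_count = 126

# A few rows from the alcohol value counts (value -> count).
counts = {
    "12.37": 6,
    "12.08": 5,
    "12.29": 4,
    "14.38": 2,
    "Other values (101)": 112,
}

# Frequency (%) = 100 * count / number of observations, rounded to one decimal.
frequency_pct = {
    value: round(100 * count / n_observations, 1)
    for value, count in counts.items()
}

# Unique (%) = 100 * distinct count / number of observations.
unique_pct = round(100 * distinct_count / n_observations, 1)

for value, pct in frequency_pct.items():
    print(f"{value}: {pct}%")
print(f"Unique (%): {unique_pct}%")
```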

malic_acid
Real number (ℝ≥0)

Distinct count: 133
Unique (%): 74.7%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2.3363483146067416
Minimum: 0.74
Maximum: 5.8
Zeros: 0
Zeros (%): 0.0%
Memory size: 1.5 KiB

Quantile statistics

Minimum: 0.74
5-th percentile: 1.061
Q1: 1.6025
median: 1.865
Q3: 3.0825
95-th percentile: 4.4555
Maximum: 5.8
Range: 5.06
Interquartile range (IQR): 1.48

Descriptive statistics

Standard deviation: 1.117146098
Coefficient of variation (CV): 0.478159053
Kurtosis: 0.2992066799
Mean: 2.336348315
Median Absolute Deviation (MAD): 0.52
Skewness: 1.039651193
Sum: 415.87
Variance: 1.248015403
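Histogram with fixed size bins (bins=10)

Value | Count | Frequency (%)
1.73 | 7 | 3.9%
1.81 | 4 | 2.2%
1.67 | 4 | 2.2%
1.68 | 3 | 1.7%
1.61 | 3 | 1.7%
1.51 | 3 | 1.7%
1.35 | 3 | 1.7%
1.53 | 3 | 1.7%
1.9 | 3 | 1.7%
3.17 | 2 | 1.1%
3.03 | 2 | 1.1%
3.43 | 2 | 1.1%
2.16 | 2 | 1.1%
2.59 | 2 | 1.1%
1.64 | 2 | 1.1%
2.05 | 2 | 1.1%
1.65 | 2 | 1.1%
1.13 | 2 | 1.1%
1.83 | 2 | 1.1%
1.71 | 2 | 1.1%
1.87 | 2 | 1.1%
1.63 | 2 | 1.1%
1.66 | 2 | 1.1%
1.72 | 2 | 1.1%
3.59 | 2 | 1.1%
Other values (108) | 113 | 63.5%

Value | Count | Frequency (%)
0.74 | 1 | 0.6%
0.89 | 1 | 0.6%
0.9 | 1 | 0.6%
0.92 | 1 | 0.6%
0.94 | 2 | 1.1%
0.98 | 1 | 0.6%
0.99 | 1 | 0.6%
1.01 | 1 | 0.6%
1.07 | 1 | 0.6%
1.09 | 1 | 0.6%

Value | Count | Frequency (%)
5.8 | 1 | 0.6%
5.65 | 1 | 0.6%
5.51 | 1 | 0.6%
5.19 | 1 | 0.6%
5.04 | 1 | 0.6%
4.95 | 1 | 0.6%
4.72 | 1 | 0.6%
4.61 | 1 | 0.6%
4.6 | 1 | 0.6%
4.43 | 1 | 0.6%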

ash
Real number (ℝ≥0)

Distinct count: 79
Unique (%): 44.4%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2.3665168539325845
Minimum: 1.36
Maximum: 3.23
Zeros: 0
Zeros (%): 0.0%
Memory size: 1.5 KiB

Quantile statistics

Minimum: 1.36
5-th percentile: 1.92
Q1: 2.21
median: 2.36
Q3: 2.5575
95-th percentile: 2.7415
Maximum: 3.23
Range: 1.87
Interquartile range (IQR): 0.3475

Descriptive statistics

Standard deviation: 0.2743440091
Coefficient of variation (CV): 0.1159273422
Kurtosis: 1.143978169
Mean: 2.366516854
Median Absolute Deviation (MAD): 0.16
Skewness: -0.1766993165
Sum: 421.24
Variance: 0.07526463531
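Histogram with fixed size bins (bins=10)

Value | Count | Frequency (%)
2.3 | 7 | 3.9%
2.28 | 7 | 3.9%
2.7 | 6 | 3.4%
2.36 | 6 | 3.4%
2.32 | 6 | 3.4%
2.48 | 5 | 2.8%
2.2 | 5 | 2.8%
2.38 | 5 | 2.8%
2.5 | 4 | 2.2%
2.4 | 4 | 2.2%
2.1 | 4 | 2.2%
2.62 | 4 | 2.2%
2.21 | 3 | 1.7%
2.27 | 3 | 1.7%
2.45 | 3 | 1.7%
2.17 | 3 | 1.7%
2.61 | 3 | 1.7%
2.6 | 3 | 1.7%
1.92 | 3 | 1.7%
2.35 | 3 | 1.7%
2.26 | 3 | 1.7%
2.12 | 3 | 1.7%
1.98 | 3 | 1.7%
2.42 | 3 | 1.7%
2.46 | 3 | 1.7%
Other values (54) | 76 | 42.7%

Value | Count | Frequency (%)
1.36 | 1 | 0.6%
1.7 | 2 | 1.1%
1.71 | 1 | 0.6%
1.75 | 1 | 0.6%
1.82 | 1 | 0.6%
1.88 | 1 | 0.6%
1.9 | 1 | 0.6%
1.92 | 3 | 1.7%
1.94 | 1 | 0.6%
1.95 | 1 | 0.6%

Value | Count | Frequency (%)
3.23 | 1 | 0.6%
3.22 | 1 | 0.6%
2.92 | 1 | 0.6%
2.87 | 1 | 0.6%
2.86 | 1 | 0.6%
2.84 | 1 | 0.6%
2.8 | 1 | 0.6%
2.78 | 1 | 0.6%
2.75 | 1 | 0.6%
2.74 | 2 | 1.1%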

alcalinity_of_ash
Real number (ℝ≥0)

Distinct count: 63
Unique (%): 35.4%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 19.49494382022472
Minimum: 10.6
Maximum: 30.0
Zeros: 0
Zeros (%): 0.0%
Memory size: 1.5 KiB

Quantile statistics

Minimum: 10.6
5-th percentile: 14.77
Q1: 17.2
median: 19.5
Q3: 21.5
95-th percentile: 25
Maximum: 30
Range: 19.4
Interquartile range (IQR): 4.3

Descriptive statistics

Standard deviation: 3.339563767
Coefficient of variation (CV): 0.171304098
Kurtosis: 0.4879415405
Mean: 19.49494382
Median Absolute Deviation (MAD): 2.05
Skewness: 0.2130468864
Sum: 3470.1
Variance: 11.15268616
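Histogram with fixed size bins (bins=10)

Value | Count | Frequency (%)
20 | 15 | 8.4%
21 | 11 | 6.2%
16 | 11 | 6.2%
18 | 10 | 5.6%
19 | 9 | 5.1%
21.5 | 8 | 4.5%
18.5 | 7 | 3.9%
22 | 7 | 3.9%
19.5 | 7 | 3.9%
22.5 | 7 | 3.9%
25 | 5 | 2.8%
16.8 | 5 | 2.8%
24 | 5 | 2.8%
20.5 | 4 | 2.2%
17 | 3 | 1.7%
17.2 | 3 | 1.7%
18.8 | 3 | 1.7%
17.5 | 3 | 1.7%
24.5 | 3 | 1.7%
23 | 3 | 1.7%
28.5 | 2 | 1.1%
18.6 | 2 | 1.1%
15.2 | 2 | 1.1%
15.5 | 2 | 1.1%
15 | 2 | 1.1%
Other values (38) | 39 | 21.9%

Value | Count | Frequency (%)
10.6 | 1 | 0.6%
11.2 | 1 | 0.6%
11.4 | 1 | 0.6%
12 | 1 | 0.6%
12.4 | 1 | 0.6%
13.2 | 1 | 0.6%
14 | 2 | 1.1%
14.6 | 1 | 0.6%
14.8 | 1 | 0.6%
15 | 2 | 1.1%

Value | Count | Frequency (%)
30 | 1 | 0.6%
28.5 | 2 | 1.1%
27 | 1 | 0.6%
26.5 | 1 | 0.6%
26 | 1 | 0.6%
25.5 | 1 | 0.6%
25 | 5 | 2.8%
24.5 | 3 | 1.7%
24 | 5 | 2.8%
23.6 | 1 | 0.6%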

magnesium
Real number (ℝ≥0)

Distinct count: 53
Unique (%): 29.8%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 99.74157303370787
Minimum: 70.0
Maximum: 162.0
Zeros: 0
Zeros (%): 0.0%
Memory size: 1.5 KiB

Quantile statistics

Minimum: 70
5-th percentile: 80.85
Q1: 88
median: 98
Q3: 107
95-th percentile: 124.3
Maximum: 162
Range: 92
Interquartile range (IQR): 19

Descriptive statistics

Standard deviation: 14.28248352
Coefficient of variation (CV): 0.1431948894
Kurtosis: 2.104991324
Mean: 99.74157303
Median Absolute Deviation (MAD): 10
Skewness: 1.098191055
Sum: 17754
Variance: 203.9893354
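Histogram with fixed size bins (bins=10)

Value | Count | Frequency (%)
88 | 13 | 7.3%
86 | 11 | 6.2%
98 | 9 | 5.1%
101 | 9 | 5.1%
96 | 8 | 4.5%
102 | 7 | 3.9%
112 | 6 | 3.4%
85 | 6 | 3.4%
94 | 6 | 3.4%
80 | 5 | 2.8%
92 | 5 | 2.8%
89 | 5 | 2.8%
97 | 5 | 2.8%
103 | 5 | 2.8%
107 | 4 | 2.2%
106 | 4 | 2.2%
90 | 4 | 2.2%
108 | 4 | 2.2%
104 | 3 | 1.7%
111 | 3 | 1.7%
78 | 3 | 1.7%
116 | 3 | 1.7%
95 | 3 | 1.7%
120 | 3 | 1.7%
110 | 3 | 1.7%
Other values (28) | 41 | 23.0%

Value | Count | Frequency (%)
70 | 1 | 0.6%
78 | 3 | 1.7%
80 | 5 | 2.8%
81 | 1 | 0.6%
82 | 1 | 0.6%
84 | 3 | 1.7%
85 | 6 | 3.4%
86 | 11 | 6.2%
87 | 3 | 1.7%
88 | 13 | 7.3%

Value | Count | Frequency (%)
162 | 1 | 0.6%
151 | 1 | 0.6%
139 | 1 | 0.6%
136 | 1 | 0.6%
134 | 1 | 0.6%
132 | 1 | 0.6%
128 | 1 | 0.6%
127 | 1 | 0.6%
126 | 1 | 0.6%
124 | 1 | 0.6%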

total_phenols
Real number (ℝ≥0)

Distinct count: 97
Unique (%): 54.5%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2.295112359550562
Minimum: 0.98
Maximum: 3.88
Zeros: 0
Zeros (%): 0.0%
Memory size: 1.5 KiB

Quantile statistics

Minimum: 0.98
5-th percentile: 1.38
Q1: 1.7425
median: 2.355
Q3: 2.8
95-th percentile: 3.2745
Maximum: 3.88
Range: 2.9
Interquartile range (IQR): 1.0575

Descriptive statistics

Standard deviation: 0.6258510488
Coefficient of variation (CV): 0.2726886317
Kurtosis: -0.8356265234
Mean: 2.29511236
Median Absolute Deviation (MAD): 0.505
Skewness: 0.0866385864
Sum: 408.53
Variance: 0.3916895353
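Histogram with fixed size bins (bins=10)

Value | Count | Frequency (%)
2.2 | 8 | 4.5%
3 | 6 | 3.4%
2.8 | 6 | 3.4%
2.6 | 6 | 3.4%
2 | 5 | 2.8%
2.95 | 5 | 2.8%
1.38 | 4 | 2.2%
1.65 | 4 | 2.2%
2.45 | 4 | 2.2%
2.85 | 4 | 2.2%
1.7 | 3 | 1.7%
1.8 | 3 | 1.7%
2.42 | 3 | 1.7%
1.68 | 3 | 1.7%
3.3 | 3 | 1.7%
1.48 | 3 | 1.7%
2.5 | 3 | 1.7%
1.98 | 3 | 1.7%
2.7 | 2 | 1.1%
2.65 | 2 | 1.1%
3.18 | 2 | 1.1%
1.4 | 2 | 1.1%
2.53 | 2 | 1.1%
1.55 | 2 | 1.1%
2.55 | 2 | 1.1%
Other values (72) | 88 | 49.4%

Value | Count | Frequency (%)
0.98 | 1 | 0.6%
1.1 | 1 | 0.6%
1.15 | 1 | 0.6%
1.25 | 1 | 0.6%
1.28 | 1 | 0.6%
1.3 | 1 | 0.6%
1.35 | 1 | 0.6%
1.38 | 4 | 2.2%
1.39 | 2 | 1.1%
1.4 | 2 | 1.1%

Value | Count | Frequency (%)
3.88 | 1 | 0.6%
3.85 | 1 | 0.6%
3.52 | 1 | 0.6%
3.5 | 1 | 0.6%
3.4 | 1 | 0.6%
3.38 | 1 | 0.6%
3.3 | 3 | 1.7%
3.27 | 1 | 0.6%
3.25 | 2 | 1.1%
3.2 | 1 | 0.6%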

flavanoids
Real number (ℝ≥0)

Distinct count: 132
Unique (%): 74.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2.0292696629213487
Minimum: 0.34
Maximum: 5.08
Zeros: 0
Zeros (%): 0.0%
Memory size: 1.5 KiB

Quantile statistics

Minimum: 0.34
5-th percentile: 0.5455
Q1: 1.205
median: 2.135
Q3: 2.875
95-th percentile: 3.4975
Maximum: 5.08
Range: 4.74
Interquartile range (IQR): 1.67

Descriptive statistics

Standard deviation: 0.998858685
Coefficient of variation (CV): 0.4922257023
Kurtosis: -0.8803815472
Mean: 2.029269663
Median Absolute Deviation (MAD): 0.835
Skewness: 0.02534355338
Sum: 361.21
Variance: 0.9977186726
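Histogram with fixed size bins (bins=10)

Value | Count | Frequency (%)
2.65 | 4 | 2.2%
0.58 | 3 | 1.7%
2.68 | 3 | 1.7%
0.6 | 3 | 1.7%
1.25 | 3 | 1.7%
2.03 | 3 | 1.7%
0.92 | 2 | 1.1%
0.66 | 2 | 1.1%
2.43 | 2 | 1.1%
2.98 | 2 | 1.1%
0.47 | 2 | 1.1%
2.26 | 2 | 1.1%
1.69 | 2 | 1.1%
2.17 | 2 | 1.1%
2.79 | 2 | 1.1%
2.76 | 2 | 1.1%
2.92 | 2 | 1.1%
3.17 | 2 | 1.1%
1.36 | 2 | 1.1%
3.39 | 2 | 1.1%
3.15 | 2 | 1.1%
1.59 | 2 | 1.1%
1.84 | 2 | 1.1%
2.69 | 2 | 1.1%
2.99 | 2 | 1.1%
Other values (107) | 121 | 68.0%

Value | Count | Frequency (%)
0.34 | 1 | 0.6%
0.47 | 2 | 1.1%
0.48 | 1 | 0.6%
0.49 | 1 | 0.6%
0.5 | 2 | 1.1%
0.51 | 1 | 0.6%
0.52 | 1 | 0.6%
0.55 | 1 | 0.6%
0.56 | 1 | 0.6%
0.57 | 1 | 0.6%

Value | Count | Frequency (%)
5.08 | 1 | 0.6%
3.93 | 1 | 0.6%
3.75 | 1 | 0.6%
3.74 | 1 | 0.6%
3.69 | 1 | 0.6%
3.67 | 1 | 0.6%
3.64 | 1 | 0.6%
3.56 | 1 | 0.6%
3.54 | 1 | 0.6%
3.49 | 1 | 0.6%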

nonflavanoid_phenols
Real number (ℝ≥0)

Distinct count: 39
Unique (%): 21.9%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.3618539325842696
Minimum: 0.13
Maximum: 0.66
Zeros: 0
Zeros (%): 0.0%
Memory size: 1.5 KiB

Quantile statistics

Minimum: 0.13
5-th percentile: 0.19
Q1: 0.27
median: 0.34
Q3: 0.4375
95-th percentile: 0.6
Maximum: 0.66
Range: 0.53
Interquartile range (IQR): 0.1675

Descriptive statistics

Standard deviation: 0.1244533403
Coefficient of variation (CV): 0.3439325349
Kurtosis: -0.6371910641
Mean: 0.3618539326
Median Absolute Deviation (MAD): 0.085
Skewness: 0.4501513356
Sum: 64.41
Variance: 0.01548863391
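Histogram with fixed size bins (bins=10)

Value | Count | Frequency (%)
0.26 | 11 | 6.2%
0.43 | 11 | 6.2%
0.29 | 10 | 5.6%
0.32 | 9 | 5.1%
0.3 | 8 | 4.5%
0.37 | 8 | 4.5%
0.34 | 8 | 4.5%
0.27 | 8 | 4.5%
0.4 | 8 | 4.5%
0.24 | 7 | 3.9%
0.53 | 7 | 3.9%
0.21 | 6 | 3.4%
0.22 | 6 | 3.4%
0.28 | 5 | 2.8%
0.39 | 5 | 2.8%
0.17 | 5 | 2.8%
0.5 | 5 | 2.8%
0.52 | 5 | 2.8%
0.47 | 4 | 2.2%
0.42 | 4 | 2.2%
0.48 | 4 | 2.2%
0.63 | 4 | 2.2%
0.58 | 3 | 1.7%
0.6 | 3 | 1.7%
0.45 | 3 | 1.7%
Other values (14) | 21 | 11.8%

Value | Count | Frequency (%)
0.13 | 1 | 0.6%
0.14 | 2 | 1.1%
0.17 | 5 | 2.8%
0.19 | 2 | 1.1%
0.2 | 2 | 1.1%
0.21 | 6 | 3.4%
0.22 | 6 | 3.4%
0.24 | 7 | 3.9%
0.25 | 2 | 1.1%
0.26 | 11 | 6.2%

Value | Count | Frequency (%)
0.66 | 1 | 0.6%
0.63 | 4 | 2.2%
0.61 | 3 | 1.7%
0.6 | 3 | 1.7%
0.58 | 3 | 1.7%
0.56 | 1 | 0.6%
0.55 | 1 | 0.6%
0.53 | 7 | 3.9%
0.52 | 5 | 2.8%
0.5 | 5 | 2.8%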

proanthocyanins
Real number (ℝ≥0)

Distinct count: 101
Unique (%): 56.7%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1.5908988764044945
Minimum: 0.41
Maximum: 3.58
Zeros: 0
Zeros (%): 0.0%
Memory size: 1.5 KiB

Quantile statistics

Minimum: 0.41
5-th percentile: 0.73
Q1: 1.25
median: 1.555
Q3: 1.95
95-th percentile: 2.709
Maximum: 3.58
Range: 3.17
Interquartile range (IQR): 0.7

Descriptive statistics

Standard deviation: 0.5723588627
Coefficient of variation (CV): 0.3597707379
Kurtosis: 0.5546485226
Mean: 1.590898876
Median Absolute Deviation (MAD): 0.38
Skewness: 0.5171371723
Sum: 283.18
Variance: 0.3275946677
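Histogram with fixed size bins (bins=10)

Value | Count | Frequency (%)
1.35 | 9 | 5.1%
1.46 | 7 | 3.9%
1.87 | 6 | 3.4%
1.25 | 5 | 2.8%
1.56 | 4 | 2.2%
1.66 | 4 | 2.2%
1.98 | 4 | 2.2%
2.08 | 4 | 2.2%
1.77 | 3 | 1.7%
1.63 | 3 | 1.7%
1.95 | 3 | 1.7%
2.29 | 3 | 1.7%
1.4 | 3 | 1.7%
2.81 | 3 | 1.7%
0.83 | 3 | 1.7%
1.62 | 3 | 1.7%
2.38 | 3 | 1.7%
1.97 | 3 | 1.7%
0.94 | 3 | 1.7%
1.03 | 3 | 1.7%
1.14 | 3 | 1.7%
1.04 | 3 | 1.7%
1.15 | 2 | 1.1%
1.42 | 2 | 1.1%
1.86 | 2 | 1.1%
Other values (76) | 87 | 48.9%

Value | Count | Frequency (%)
0.41 | 1 | 0.6%
0.42 | 2 | 1.1%
0.55 | 1 | 0.6%
0.62 | 1 | 0.6%
0.64 | 2 | 1.1%
0.68 | 1 | 0.6%
0.73 | 2 | 1.1%
0.75 | 1 | 0.6%
0.8 | 2 | 1.1%
0.81 | 1 | 0.6%

Value | Count | Frequency (%)
3.58 | 1 | 0.6%
3.28 | 1 | 0.6%
2.96 | 1 | 0.6%
2.91 | 2 | 1.1%
2.81 | 3 | 1.7%
2.76 | 1 | 0.6%
2.7 | 1 | 0.6%
2.5 | 1 | 0.6%
2.49 | 1 | 0.6%
2.45 | 1 | 0.6%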

color_intensity
Real number (ℝ≥0)

Distinct count: 132
Unique (%): 74.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 5.058089882022472
Minimum: 1.28
Maximum: 13.0
Zeros: 0
Zeros (%): 0.0%
Memory size: 1.5 KiB

Quantile statistics

Minimum: 1.28
5-th percentile: 2.114
Q1: 3.22
median: 4.69
Q3: 6.2
95-th percentile: 9.598
Maximum: 13
Range: 11.72
Interquartile range (IQR): 2.98

Descriptive statistics

Standard deviation: 2.318285872
Coefficient of variation (CV): 0.4583322807
Kurtosis: 0.3815222728
Mean: 5.058089882
Median Absolute Deviation (MAD): 1.51
Skewness: 0.868584791
Sum: 900.339999
Variance: 5.374449383
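Histogram with fixed size bins (bins=10)

Value | Count | Frequency (%)
2.6 | 4 | 2.2%
4.6 | 4 | 2.2%
3.8 | 4 | 2.2%
3.4 | 3 | 1.7%
3.05 | 3 | 1.7%
2.9 | 3 | 1.7%
5 | 3 | 1.7%
4.5 | 3 | 1.7%
5.7 | 3 | 1.7%
2.8 | 3 | 1.7%
5.6 | 3 | 1.7%
5.4 | 3 | 1.7%
5.1 | 3 | 1.7%
7.3 | 2 | 1.1%
2.06 | 2 | 1.1%
1.95 | 2 | 1.1%
4.8 | 2 | 1.1%
7.1 | 2 | 1.1%
6.2 | 2 | 1.1%
2.7 | 2 | 1.1%
7.65 | 2 | 1.1%
2.45 | 2 | 1.1%
3.3 | 2 | 1.1%
2.65 | 2 | 1.1%
4.9 | 2 | 1.1%
Other values (107) | 112 | 62.9%

Value | Count | Frequency (%)
1.28 | 1 | 0.6%
1.74 | 1 | 0.6%
1.9 | 1 | 0.6%
1.95 | 2 | 1.1%
2 | 1 | 0.6%
2.06 | 2 | 1.1%
2.08 | 1 | 0.6%
2.12 | 1 | 0.6%
2.15 | 1 | 0.6%
2.2 | 1 | 0.6%

Value | Count | Frequency (%)
13 | 1 | 0.6%
11.75 | 1 | 0.6%
10.8 | 1 | 0.6%
10.68 | 1 | 0.6%
10.52 | 1 | 0.6%
10.26 | 1 | 0.6%
10.2 | 1 | 0.6%
9.899999 | 1 | 0.6%
9.7 | 1 | 0.6%
9.58 | 1 | 0.6%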

hue
Real number (ℝ≥0)

Distinct count: 78
Unique (%): 43.8%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.9574494382022471
Minimum: 0.48
Maximum: 1.71
Zeros: 0
Zeros (%): 0.0%
Memory size: 1.5 KiB

Quantile statistics

Minimum: 0.48
5-th percentile: 0.57
Q1: 0.7825
median: 0.965
Q3: 1.12
95-th percentile: 1.2845
Maximum: 1.71
Range: 1.23
Interquartile range (IQR): 0.3375

Descriptive statistics

Standard deviation: 0.2285715658
Coefficient of variation (CV): 0.2387296464
Kurtosis: -0.3440957414
Mean: 0.9574494382
Median Absolute Deviation (MAD): 0.165
Skewness: 0.0210912722
Sum: 170.426
Variance: 0.05224496071
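Histogram with fixed size bins (bins=10)

Value | Count | Frequency (%)
1.04 | 8 | 4.5%
1.23 | 7 | 3.9%
1.12 | 6 | 3.4%
0.89 | 5 | 2.8%
0.57 | 5 | 2.8%
0.96 | 5 | 2.8%
1.25 | 5 | 2.8%
1.05 | 4 | 2.2%
1.09 | 4 | 2.2%
0.75 | 4 | 2.2%
0.86 | 4 | 2.2%
0.7 | 4 | 2.2%
1.07 | 4 | 2.2%
1.19 | 4 | 2.2%
0.88 | 3 | 1.7%
0.95 | 3 | 1.7%
0.91 | 3 | 1.7%
0.98 | 3 | 1.7%
1.13 | 3 | 1.7%
1.02 | 3 | 1.7%
1.16 | 3 | 1.7%
0.93 | 3 | 1.7%
0.6 | 3 | 1.7%
1.06 | 3 | 1.7%
0.92 | 3 | 1.7%
Other values (53) | 76 | 42.7%

Value | Count | Frequency (%)
0.48 | 1 | 0.6%
0.54 | 1 | 0.6%
0.55 | 1 | 0.6%
0.56 | 2 | 1.1%
0.57 | 5 | 2.8%
0.58 | 2 | 1.1%
0.59 | 2 | 1.1%
0.6 | 3 | 1.7%
0.61 | 2 | 1.1%
0.62 | 1 | 0.6%

Value | Count | Frequency (%)
1.71 | 1 | 0.6%
1.45 | 1 | 0.6%
1.42 | 1 | 0.6%
1.38 | 1 | 0.6%
1.36 | 2 | 1.1%
1.33 | 1 | 0.6%
1.31 | 2 | 1.1%
1.28 | 2 | 1.1%
1.27 | 1 | 0.6%
1.25 | 5 | 2.8%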

od280/od315_of_diluted_wines
Real number (ℝ≥0)

Distinct count: 122
Unique (%): 68.5%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2.6116853932584267
Minimum: 1.27
Maximum: 4.0
Zeros: 0
Zeros (%): 0.0%
Memory size: 1.5 KiB

Quantile statistics

Minimum: 1.27
5-th percentile: 1.4625
Q1: 1.9375
median: 2.78
Q3: 3.17
95-th percentile: 3.58
Maximum: 4
Range: 2.73
Interquartile range (IQR): 1.2325

Descriptive statistics

Standard deviation: 0.7099904288
Coefficient of variation (CV): 0.2718514376
Kurtosis: -1.086434527
Mean: 2.611685393
Median Absolute Deviation (MAD): 0.52
Skewness: -0.307285499
Sum: 464.88
Variance: 0.5040864089
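Histogram with fixed size bins (bins=10)

Value | Count | Frequency (%)
2.87 | 5 | 2.8%
3 | 4 | 2.2%
1.82 | 4 | 2.2%
2.78 | 4 | 2.2%
2.77 | 3 | 1.7%
1.75 | 3 | 1.7%
1.33 | 3 | 1.7%
2.31 | 3 | 1.7%
3.33 | 3 | 1.7%
2.96 | 3 | 1.7%
3.17 | 3 | 1.7%
1.56 | 3 | 1.7%
1.51 | 2 | 1.1%
2.65 | 2 | 1.1%
3.4 | 2 | 1.1%
2.06 | 2 | 1.1%
3.21 | 2 | 1.1%
1.68 | 2 | 1.1%
2.26 | 2 | 1.1%
3.3 | 2 | 1.1%
3.58 | 2 | 1.1%
1.58 | 2 | 1.1%
3.16 | 2 | 1.1%
3.26 | 2 | 1.1%
2.44 | 2 | 1.1%
Other values (97) | 111 | 62.4%

Value | Count | Frequency (%)
1.27 | 1 | 0.6%
1.29 | 2 | 1.1%
1.3 | 1 | 0.6%
1.33 | 3 | 1.7%
1.36 | 1 | 0.6%
1.42 | 1 | 0.6%
1.47 | 1 | 0.6%
1.48 | 1 | 0.6%
1.51 | 2 | 1.1%
1.55 | 1 | 0.6%

Value | Count | Frequency (%)
4 | 1 | 0.6%
3.92 | 1 | 0.6%
3.82 | 1 | 0.6%
3.71 | 1 | 0.6%
3.69 | 1 | 0.6%
3.64 | 1 | 0.6%
3.63 | 1 | 0.6%
3.59 | 1 | 0.6%
3.58 | 2 | 1.1%
3.57 | 1 | 0.6%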

proline
Real number (ℝ≥0)

Distinct count: 121
Unique (%): 68.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 746.8932584269663
Minimum: 278.0
Maximum: 1680.0
Zeros: 0
Zeros (%): 0.0%
Memory size: 1.5 KiB

Quantile statistics

Minimum: 278
5-th percentile: 354.55
Q1: 500.5
median: 673.5
Q3: 985
95-th percentile: 1297.25
Maximum: 1680
Range: 1402
Interquartile range (IQR): 484.5

Descriptive statistics

Standard deviation: 314.9074743
Coefficient of variation (CV): 0.4216231312
Kurtosis: -0.2484031061
Mean: 746.8932584
Median Absolute Deviation (MAD): 202.5
Skewness: 0.7678217814
Sum: 132947
Variance: 99166.71736
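Histogram with fixed size bins (bins=10)

Value | Count | Frequency (%)
680 | 5 | 2.8%
520 | 5 | 2.8%
750 | 4 | 2.2%
630 | 4 | 2.2%
625 | 4 | 2.2%
495 | 3 | 1.7%
562 | 3 | 1.7%
450 | 3 | 1.7%
480 | 3 | 1.7%
660 | 3 | 1.7%
1285 | 3 | 1.7%
1035 | 3 | 1.7%
510 | 3 | 1.7%
650 | 2 | 1.1%
550 | 2 | 1.1%
600 | 2 | 1.1%
880 | 2 | 1.1%
560 | 2 | 1.1%
580 | 2 | 1.1%
380 | 2 | 1.1%
1060 | 2 | 1.1%
428 | 2 | 1.1%
1045 | 2 | 1.1%
500 | 2 | 1.1%
1065 | 2 | 1.1%
Other values (96) | 108 | 60.7%

Value | Count | Frequency (%)
278 | 1 | 0.6%
290 | 1 | 0.6%
312 | 1 | 0.6%
315 | 1 | 0.6%
325 | 1 | 0.6%
342 | 1 | 0.6%
345 | 2 | 1.1%
352 | 1 | 0.6%
355 | 1 | 0.6%
365 | 1 | 0.6%

Value | Count | Frequency (%)
1680 | 1 | 0.6%
1547 | 1 | 0.6%
1515 | 1 | 0.6%
1510 | 1 | 0.6%
1480 | 1 | 0.6%
1450 | 1 | 0.6%
1375 | 1 | 0.6%
1320 | 1 | 0.6%
1310 | 1 | 0.6%
1295 | 1 | 0.6%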

Interactions

Correlations

Pearson's r

Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1, with -1 indicating total negative linear correlation, 0 indicating no linear correlation, and +1 indicating total positive linear correlation. Furthermore, r is invariant under separate changes in location and (positive) scale of the two variables, which implies that for a linear relationship the steepness of the line (its angle to the x-axis) does not affect r.

To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
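
The calculation described above can be sketched in a few lines of plain Python. This is a minimal illustration, not the code pandas-profiling runs: the function name `pearson_r` is made up for this example, and the input values are the alcohol and proline columns from the first five sample rows of this report. The second call illustrates the invariance under a change of location and positive scale.

```python
import math

def pearson_r(x, y):
    """Covariance of x and y divided by the product of their standard deviations."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sample covariance and sample standard deviations (n - 1 in the denominator).
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / (n - 1)
    std_x = math.sqrt(sum((a - mean_x) ** 2 for a in x) / (n - 1))
    std_y = math.sqrt(sum((b - mean_y) ** 2 for b in y) / (n - 1))
    return cov / (std_x * std_y)

# Alcohol and proline from the first five sample rows of this report.
alcohol = [14.23, 13.20, 13.16, 14.37, 13.24]
proline = [1065.0, 1050.0, 1185.0, 1480.0, 735.0]

r = pearson_r(alcohol, proline)
# Rescaling and shifting one variable leaves r unchanged.
r_rescaled = pearson_r([2 * a + 5 for a in alcohol], proline)
print(r, r_rescaled)
```

The function divides by zero when either variable is constant; a production implementation would need to handle that case explicitly.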

Spearman's ρ

Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables, and is therefore better at catching nonlinear monotonic correlations than Pearson's r. Its value lies between -1 and +1, with -1 indicating total negative monotonic correlation, 0 indicating no monotonic correlation, and +1 indicating total positive monotonic correlation.

To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
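
As a minimal sketch of that definition, the code below ranks both variables and then applies the covariance-over-standard-deviations formula to the ranks. The helper names (`average_ranks`, `pearson_r`, `spearman_rho`) are invented for this example, and giving tied values the mean of their rank positions is an assumption about tie handling, not something stated in this report. The toy data is a cubic relationship, which is monotonic but not linear.

```python
import math

def average_ranks(values):
    """Rank values from 1 to n; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def pearson_r(x, y):
    """Covariance divided by the product of standard deviations (n - 1 cancels)."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

def spearman_rho(x, y):
    """Pearson's r computed on the rank variables of x and y."""
    return pearson_r(average_ranks(x), average_ranks(y))

x = [1, 2, 3, 4, 5]
y = [v ** 3 for v in x]  # monotonic but nonlinear

rho = spearman_rho(x, y)
r = pearson_r(x, y)
print(rho, r)
```

On this data the rank-based coefficient reaches its maximum while Pearson's r stays below it, which is the behaviour the paragraph above describes.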

Kendall's τ

Similarly to Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1, with -1 indicating total negative correlation, 0 indicating no correlation, and +1 indicating total positive correlation.

To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is given by the number of concordant pairs minus the discordant pairs divided by the total number of pairs.
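
The pair-counting rule above can be written out directly. The sketch below implements exactly the formula as stated (often called tau-a); the function name `kendall_tau` is made up for this example. Library implementations often apply a correction for ties (tau-b), so their results can differ from this sketch when the data contains tied values.

```python
from itertools import combinations

def kendall_tau(x, y):
    """(concordant pairs - discordant pairs) / total number of pairs."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    concordant = 0
    discordant = 0
    pairs = list(combinations(range(len(x)), 2))
    for i, j in pairs:
        # A pair is concordant when x and y move in the same direction.
        direction = (x[i] - x[j]) * (y[i] - y[j])
        if direction > 0:
            concordant += 1
        elif direction < 0:
            discordant += 1
        # Tied pairs (direction == 0) count as neither.
    return (concordant - discordant) / len(pairs)

x = [1, 2, 3, 4]
y = [1, 3, 2, 4]  # one swapped pair out of six
tau = kendall_tau(x, y)
print(tau)
```

The double loop over all pairs is quadratic in the number of observations, which is fine for 178 rows but slow for large datasets.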

Phik (φk)

Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency and reverts to the Pearson correlation coefficient in case of a bivariate normal input distribution. There is extensive documentation available here.

Missing values

Sample

First rows

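index | alcohol | malic_acid | ash | alcalinity_of_ash | magnesium | total_phenols | flavanoids | nonflavanoid_phenols | proanthocyanins | color_intensity | hue | od280/od315_of_diluted_wines | proline
0 | 14.23 | 1.71 | 2.43 | 15.6 | 127.0 | 2.80 | 3.06 | 0.28 | 2.29 | 5.64 | 1.04 | 3.92 | 1065.0
1 | 13.20 | 1.78 | 2.14 | 11.2 | 100.0 | 2.65 | 2.76 | 0.26 | 1.28 | 4.38 | 1.05 | 3.40 | 1050.0
2 | 13.16 | 2.36 | 2.67 | 18.6 | 101.0 | 2.80 | 3.24 | 0.30 | 2.81 | 5.68 | 1.03 | 3.17 | 1185.0
3 | 14.37 | 1.95 | 2.50 | 16.8 | 113.0 | 3.85 | 3.49 | 0.24 | 2.18 | 7.80 | 0.86 | 3.45 | 1480.0
4 | 13.24 | 2.59 | 2.87 | 21.0 | 118.0 | 2.80 | 2.69 | 0.39 | 1.82 | 4.32 | 1.04 | 2.93 | 735.0
5 | 14.20 | 1.76 | 2.45 | 15.2 | 112.0 | 3.27 | 3.39 | 0.34 | 1.97 | 6.75 | 1.05 | 2.85 | 1450.0
6 | 14.39 | 1.87 | 2.45 | 14.6 | 96.0 | 2.50 | 2.52 | 0.30 | 1.98 | 5.25 | 1.02 | 3.58 | 1290.0
7 | 14.06 | 2.15 | 2.61 | 17.6 | 121.0 | 2.60 | 2.51 | 0.31 | 1.25 | 5.05 | 1.06 | 3.58 | 1295.0
8 | 14.83 | 1.64 | 2.17 | 14.0 | 97.0 | 2.80 | 2.98 | 0.29 | 1.98 | 5.20 | 1.08 | 2.85 | 1045.0
9 | 13.86 | 1.35 | 2.27 | 16.0 | 98.0 | 2.98 | 3.15 | 0.22 | 1.85 | 7.22 | 1.01 | 3.55 | 1045.0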

Last rows

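index | alcohol | malic_acid | ash | alcalinity_of_ash | magnesium | total_phenols | flavanoids | nonflavanoid_phenols | proanthocyanins | color_intensity | hue | od280/od315_of_diluted_wines | proline
168 | 13.58 | 2.58 | 2.69 | 24.5 | 105.0 | 1.55 | 0.84 | 0.39 | 1.54 | 8.660000 | 0.74 | 1.80 | 750.0
169 | 13.40 | 4.60 | 2.86 | 25.0 | 112.0 | 1.98 | 0.96 | 0.27 | 1.11 | 8.500000 | 0.67 | 1.92 | 630.0
170 | 12.20 | 3.03 | 2.32 | 19.0 | 96.0 | 1.25 | 0.49 | 0.40 | 0.73 | 5.500000 | 0.66 | 1.83 | 510.0
171 | 12.77 | 2.39 | 2.28 | 19.5 | 86.0 | 1.39 | 0.51 | 0.48 | 0.64 | 9.899999 | 0.57 | 1.63 | 470.0
172 | 14.16 | 2.51 | 2.48 | 20.0 | 91.0 | 1.68 | 0.70 | 0.44 | 1.24 | 9.700000 | 0.62 | 1.71 | 660.0
173 | 13.71 | 5.65 | 2.45 | 20.5 | 95.0 | 1.68 | 0.61 | 0.52 | 1.06 | 7.700000 | 0.64 | 1.74 | 740.0
174 | 13.40 | 3.91 | 2.48 | 23.0 | 102.0 | 1.80 | 0.75 | 0.43 | 1.41 | 7.300000 | 0.70 | 1.56 | 750.0
175 | 13.27 | 4.28 | 2.26 | 20.0 | 120.0 | 1.59 | 0.69 | 0.43 | 1.35 | 10.200000 | 0.59 | 1.56 | 835.0
176 | 13.17 | 2.59 | 2.37 | 20.0 | 120.0 | 1.65 | 0.68 | 0.53 | 1.46 | 9.300000 | 0.60 | 1.62 | 840.0
177 | 14.13 | 4.10 | 2.74 | 24.5 | 96.0 | 2.05 | 0.76 | 0.56 | 1.35 | 9.200000 | 0.61 | 1.60 | 560.0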